117 research outputs found

    Database Queries that Explain their Work

    Get PDF
    Provenance for database queries or scientific workflows is often motivated as providing explanation, increasing understanding of the underlying data sources and processes used to compute the query, and reproducibility, the capability to recompute the results on different inputs, possibly specialized to a part of the output. Many provenance systems claim to provide such capabilities; however, most lack formal definitions or guarantees of these properties, while others provide formal guarantees only for relatively limited classes of changes. Building on recent work on provenance traces and slicing for functional programming languages, we introduce a detailed tracing model of provenance for multiset-valued Nested Relational Calculus, define trace slicing algorithms that extract subtraces needed to explain or recompute specific parts of the output, and define query slicing and differencing techniques that support explanation. We state and prove correctness properties for these techniques and present a proof-of-concept implementation in Haskell.Comment: PPDP 201

    Coupling Memory and Computation for Locality Management

    Get PDF
    We articulate the need for managing (data) locality automatically rather than leaving it to the programmer, especially in parallel programming systems. To this end, we propose techniques for coupling tightly the computation (including the thread scheduler) and the memory manager so that data and computation can be positioned closely in hardware. Such tight coupling of computation and memory management is in sharp contrast with the prevailing practice of considering each in isolation. For example, memory-management techniques usually abstract the computation as an unknown "mutator", which is treated as a "black box". As an example of the approach, in this paper we consider a specific class of parallel computations, nested-parallel computations. Such computations dynamically create a nesting of parallel tasks. We propose a method for organizing memory as a tree of heaps reflecting the structure of the nesting. More specifically, our approach creates a heap for a task if it is separately scheduled on a processor. This allows us to couple garbage collection with the structure of the computation and the way in which it is dynamically scheduled on the processors. This coupling enables taking advantage of locality in the program by mapping it to the locality of the hardware. For example for improved locality a heap can be garbage collected immediately after its task finishes when the heap contents is likely in cache

    Editorial:Special issue dedicated to ICFP 2010

    Get PDF

    Parallel Batch-Dynamic Trees via Change Propagation

    Get PDF
    The dynamic trees problem is to maintain a forest subject to edge insertions and deletions while facilitating queries such as connectivity, path weights, and subtree weights. Dynamic trees are a fundamental building block of a large number of graph algorithms. Although traditionally studied in the single-update setting, dynamic algorithms capable of supporting batches of updates are increasingly relevant today due to the emergence of rapidly evolving dynamic datasets. Since processing updates on a single processor is often unrealistic for large batches of updates, designing parallel batch-dynamic algorithms that achieve provably low span is important for many applications. In this work, we design the first work-efficient parallel batch-dynamic algorithm for dynamic trees that is capable of supporting both path queries and subtree queries, as well as a variety of nonlocal queries. Previous work-efficient dynamic trees of Tseng et al. were only capable of handling subtree queries [ALENEX\u2719, (2019), pp. 92 - 106]. To achieve this, we propose a framework for algorithmically dynamizing static round-synchronous algorithms to obtain parallel batch-dynamic algorithms. In our framework, the algorithm designer can apply the technique to any suitably defined static algorithm. We then obtain theoretical guarantees for algorithms in our framework by defining the notion of a computation distance between two executions of the underlying algorithm. Our dynamic trees algorithm is obtained by applying our dynamization framework to the parallel tree contraction algorithm of Miller and Reif [FOCS\u2785, (1985), pp. 478 - 489], and then performing a novel analysis of the computation distance of this algorithm under batch updates. We show that k updates can be performed in O(klog(1+n/k)) work in expectation, which matches the algorithm of Tseng et al. while providing support for a substantially larger number of queries and applications
    corecore